AnCoraPipe: A new tool for corpora annotation
Abstract
This paper describes AnCoraPipe, an environment for the creation, editing and analysis of linguistic corpora and lexicons. AnCoraPipe has been used in the development of several linguistic resources: the AnCora, CesCa, ClInt and Amazighe corpora, and the verbal and nominal AnCora lexicons. We present the functionalities of AnCoraPipe, the way in which the data and metadata are structured, and some implementation details.
Similar resources
AnCoraPipe: A tool for multilevel annotation
AnCoraPipe is a corpus annotation tool which allows different linguistic levels to be annotated simultaneously and efficiently, since it uses a single format for all stages. In this way, the required annotation time is reduced and the integration of the work of all annotators is made easier.
A New Annotation Tool for Aligned Bilingual Corpora
This paper presents a new annotation tool for aligned bilingual corpora, which allows the annotation of a wide range of information, ranging from information about words (such as part-of-speech tags or named-entities) to quite complex annotation schemas involving links between aligned segments, such as co-reference or translation equivalence between aligned segments in the two languages. The an...
Coreference Annotator - A new annotation tool for aligned bilingual corpora
This paper presents the main features of an annotation tool, the Coreference Annotator, which manages bilingual corpora consisting of aligned texts that can be grouped in collections and subcollections according to their topics and discourse. The tool allows the manual annotation of certain linguistic items in the source text and their translation equivalent in the target text, by entering usef...
MMAX: A Tool for the Annotation of Multi-modal Corpora
We present a tool for the annotation of XML-encoded multi-modal language corpora. Non-hierarchical data is supported by means of standoff annotation. We define base level and suprabase level elements and theory-independent markables for multi-modal annotation and apply them to a cospecification annotation scheme. We also describe how arbitrary annotation schemes can be represented in terms of the...
SegProso: A Praat-Based Tool for the Automatic Detection and Annotation of Prosodic Boundaries in Speech Corpora
In this paper we describe SegProso, a Praat-based tool for the automatic segmentation of speech corpora into prosodic units. It is made up of a set of Praat scripts that add several tiers, each one containing the segmentation of a different unit, to a previously existing TextGrid file that includes the phonetic segmentation of the associated wav file. It has been successfully used for the annotation ...